Packages and installation
The term ‘package’ is ambiguous. It can refer to an operating system package (e.g. a .deb file installed by dpkg), a Python distribution package installed by pip or conda, or a container of modules with respect to the Python import system. In this document, the unqualified term ‘package’ refers to a Python distribution package, and the two other uses are referred to as ‘OS package’ and ‘import package’.
Prefixes
A prefix is a set of directories which contain everything needed to run Python code in a specific context, including virtual or Conda environments. The current prefix can be obtained from sys.prefix.
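As a small illustration, the prefix of the running interpreter can be printed as follows. The comparison with sys.base_prefix is an addition not discussed in this document; it is a common way to detect a venv-style environment.

```python
import sys

print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)

# In a venv-style environment the two values differ; for an OS interpreter
# or a Conda environment they are normally identical.
in_venv = sys.prefix != sys.base_prefix
print("venv-style environment active:", in_venv)
```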
OS prefix
Without an activated Python environment, Python code is run by an OS Python interpreter, typically /usr/bin/python3, which has access to packages in the OS prefix /usr/.
In particular, the Python standard library is in /usr/lib/python<version>/, other Python packages are in /usr/local/lib/python<version>/dist-packages, and Python applications and related executables are in /usr/local/bin/.
For Python code run by a user, this is augmented by the quasi-prefix ~/.local/, where additional Python packages can be installed below ~/.local/lib/python<version>/ and Python applications in ~/.local/bin/.
Following PEP 668, the OS prefix (including its augmentation by ~/.local/) is considered ‘externally managed’ by the OS package manager, and the user should not modify it using pip; pip displays an error message if this is attempted. Though discouraged, it is still possible to modify /usr/ with sudo pip install --break-system-packages, or ~/.local/ with pip install --user --break-system-packages.
Instead, Python packages should be installed using the OS package manager if possible, and using pipx otherwise. pipx transparently creates a separate environment for each application in ~/.local/pipx/venvs, installs it and its dependencies there, and creates scripts in ~/.local/bin/.
A Python script written by the user should either use only Python packages installed via the OS package manager, or be packaged as an application and installed with its dependencies via pipx, possibly with the --editable option for development.
Environment prefixes
With an activated Python environment, Python code is run by a Python interpreter installed in the environment, typically <prefix>/bin/python, which has access to the environment prefix <envs directory>/<env name>/, for example:
~/anaconda3/envs/<env name>/
~/.conda/envs/<env name>/
~/.local/pipx/venvs/<env name>/
~/.virtualenvs/<env name>/
When an environment is activated, PATH is modified, among other things, by prepending <prefix>/bin, so that Python interpreters from the environment are found first. The same directory contains Python applications installed in that environment; during installation, the application script’s shebang is set to #!<prefix>/bin/python.
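The effect of activation on PATH can be observed with a few lines of Python. This is only a sketch: it looks up the command name python, which may not exist on systems that only provide python3, in which case shutil.which() returns None.

```python
import os
import shutil
import sys

# The first PATH entry is <prefix>/bin when an environment has been activated.
first_path_entry = os.environ.get("PATH", "").split(os.pathsep)[0]

# Interpreter that the shell would find for the command name "python".
found = shutil.which("python")

print("first PATH entry:   ", first_path_entry)
print("python on PATH:     ", found)
print("running interpreter:", sys.executable)
```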
Prefix structure
The structure of a prefix is defined by an ‘install scheme’, a mapping from a set of eight identifiers to directories with different purposes:
identifier | contents of directory |
---|---|
stdlib | Python standard library |
platstdlib | Python standard library (platform-specific) |
purelib | additionally installed Python packages |
platlib | additionally installed Python packages (platform-specific) |
include | header files for the Python C-API |
platinclude | header files for the Python C-API (platform-specific) |
scripts | Python application script files and other executables |
data | data files |
The current install scheme can be obtained by sysconfig.get_paths(). Examples:
- without activated environment:
  {'stdlib': '/usr/lib/python<version>', 'platstdlib': '/usr/lib/python<version>', 'purelib': '/usr/local/lib/python<version>/dist-packages', 'platlib': '/usr/local/lib/python<version>/dist-packages', 'include': '/usr/include/python<version>', 'platinclude': '/usr/include/python<version>', 'scripts': '/usr/local/bin', 'data': '/usr/local'}
- with activated Conda environment:
  {'stdlib': '<prefix>/lib/python<version>', 'platstdlib': '<prefix>/lib/python<version>', 'purelib': '<prefix>/lib/python<version>/site-packages', 'platlib': '<prefix>/lib/python<version>/site-packages', 'include': '<prefix>/include/python<version>', 'platinclude': '<prefix>/include/python<version>', 'scripts': '<prefix>/bin', 'data': '<prefix>'}
- with activated venv environment (created by pipx):
  {'stdlib': '/usr/lib/python<version>', 'platstdlib': '<prefix>/lib/python<version>', 'purelib': '<prefix>/lib/python<version>/site-packages', 'platlib': '<prefix>/lib/python<version>/site-packages', 'include': '/usr/include/python<version>', 'platinclude': '/usr/include/python<version>', 'scripts': '<prefix>/bin', 'data': '<prefix>'}
Install schemes often do not distinguish between platform-specific and non-platform-specific files, and additionally installed Python packages are typically in a subdirectory dist-packages or site-packages of the standard library directory. It is also possible to combine OS directories with environment directories.
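The install scheme of the running interpreter can be inspected with a short script. The sketch below prints each identifier with its directory and then reports whether purelib and platlib coincide, which illustrates the remark about platform-specific directories; the variable names are illustrative.

```python
import sysconfig

# Mapping from install scheme identifiers to directories.
paths = sysconfig.get_paths()

for identifier, directory in paths.items():
    print(f"{identifier:<12} {directory}")

# Many schemes use the same directory for pure and platform-specific packages.
same_lib_dir = paths["purelib"] == paths["platlib"]
print("purelib and platlib coincide:", same_lib_dir)
```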
Binary distribution format ‘wheel’ and its installation
A wheel is a zip file with a name of the form:
{distribution}-{version}-{python}-{abi}-{platform}.whl
Optionally, a build number can be included between version and python.
component | meaning |
---|---|
distribution | name of the packaged software |
version | version of the packaged software |
python | Python language implementation and version, e.g. py3 |
abi | application binary interface, none for pure Python |
platform | platform the package is built for, any for pure Python |
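The naming scheme above can be taken apart with a few lines of Python. This is a minimal sketch, not the parser used by pip: it relies on the convention that hyphens inside the individual components are replaced by underscores, so that splitting on ‘-’ is unambiguous. The names WheelName and parse_wheel_filename, as well as the example file names, are hypothetical.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]  # optional build number
    python: str
    abi: str
    platform: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel file name into its components (illustrative sketch)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python, abi, platform = parts
    else:
        raise ValueError(f"unexpected number of components: {filename!r}")
    return WheelName(distribution, version, build, python, abi, platform)


print(parse_wheel_filename("example_pkg-1.0-py3-none-any.whl"))
print(parse_wheel_filename("example_pkg-1.0-2-cp311-cp311-linux_x86_64.whl"))
```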
The zip file contains the implementation files, metadata in a subdirectory {distribution}-{version}.dist-info/, and it can contain data files in a subdirectory {distribution}-{version}.data/.
During installation, the zip file is unpacked into the prefix’s purelib or platlib directory (depending on whether it is a pure Python package or not), i.e. typically <prefix>/lib/python<version>/site-packages. Implementation files and {distribution}-{version}.dist-info/ remain in this directory. Python files in the zip file’s root thereby become accessible as import modules, and directories as import packages.
If a subdirectory {distribution}-{version}.data/ exists, the contents of its subdirectories named after the install scheme identifiers are moved into the corresponding directories specified in the install scheme, and the empty subdirectory {distribution}-{version}.data/ is removed.
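The placement rules described above can be sketched as a function that computes the destination of a single archive member. This is an illustration of the idea only, not the logic of pip or any other installer: the function name install_target, the pure flag, and the example member names are hypothetical, and the sketch assumes that every subdirectory of the .data directory is named after a key of the given install scheme.

```python
import sysconfig
from pathlib import PurePosixPath
from typing import Dict


def install_target(member: str, pure: bool, paths: Dict[str, str]) -> str:
    """Return the destination path for one wheel archive member (sketch)."""
    parts = PurePosixPath(member).parts
    if len(parts) >= 3 and parts[0].endswith(".data"):
        # {distribution}-{version}.data/<identifier>/<rest...>
        identifier, rest = parts[1], parts[2:]
        base = paths[identifier]  # KeyError for identifiers unknown to the scheme
    else:
        # Everything else stays in purelib or platlib.
        base = paths["purelib" if pure else "platlib"]
        rest = parts
    return str(PurePosixPath(base, *rest))


paths = sysconfig.get_paths()
for member in (
    "demo/__init__.py",
    "demo-1.0.dist-info/METADATA",
    "demo-1.0.data/scripts/demo",
    "demo-1.0.data/data/share/jupyter/kernels/demo/kernel.json",
):
    print(member, "->", install_target(member, True, paths))
```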
This can be used e.g. to install Python application script files and other executables via a subdirectory {distribution}-{version}.data/scripts/. Since in an environment the identifier data typically points to the environment directory itself, arbitrary locations within it can be targeted, e.g. to install a Jupyter kernel specification with {distribution}-{version}.data/data/share/jupyter/kernels or a JupyterLab extension with {distribution}-{version}.data/data/share/jupyter/labextensions.
Binary distribution format ‘conda’ and its installation
TODO